Generalizing the Optimality of Multi-step k-Nearest Neighbor Query Processing
Abstract
Similarity search algorithms that rely directly on index structures and require many distance computations are usually not applicable to databases that contain complex objects and define costly distance functions on spatial, temporal, and multimedia data. Rather, an adequate multi-step query processing strategy is crucial for the performance of a similarity search routine that deals with complex distance functions. Reducing the number of candidates that are returned from the filter step and must then be evaluated exactly in the refinement step is fundamental to the efficiency of the query process. The state-of-the-art multi-step k-nearest neighbor (kNN) search algorithms are designed to use only a lower-bounding distance estimation for candidate pruning. However, in many applications, an upper-bounding distance approximation is also available and can additionally be used to reduce the number of candidates. In this paper, we generalize the traditional concept of R-optimality and introduce the notion of R_I-optimality, which depends on the distance information I available in the filter step. We propose a new multi-step kNN search algorithm that utilizes lower- and upper-bounding distance information (I_lu) in the filter step. Furthermore, we show that, in contrast to existing approaches, our proposed solution is R_{I_lu}-optimal. In an experimental evaluation, we demonstrate a significant performance gain over existing methods.
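The filter-and-refine idea described above can be illustrated with a minimal Python sketch. This is not the algorithm proposed in the paper, which the abstract does not spell out; it is a simple, non-optimal scheme that only shows why an upper bound helps pruning. The function name `multistep_knn` and the callables `lower`, `upper`, and `exact` are hypothetical, and the sketch assumes that `lower(q, o) <= exact(q, o) <= upper(q, o)` holds for every object.

```python
import heapq


def multistep_knn(query, objects, k, lower, upper, exact):
    """Filter-and-refine kNN sketch (illustrative, not the paper's algorithm).

    Assumes lower(q, o) <= exact(q, o) <= upper(q, o) for all objects.
    Returns (list of (distance, object) pairs, number of exact computations).
    """
    if k <= 0 or not objects:
        return [], 0
    k = min(k, len(objects))

    # Filter step: compute only the cheap bounds for every object.
    bounds = [(lower(query, o), upper(query, o), o) for o in objects]

    # At least k objects lie within the k-th smallest upper bound, so the
    # true k-th nearest neighbor distance cannot exceed this threshold.
    threshold = heapq.nsmallest(k, (ub for _, ub, _ in bounds))[-1]

    # An object whose lower bound exceeds the threshold cannot be a kNN.
    candidates = [o for lb, _, o in bounds if lb <= threshold]

    # Refinement step: costly exact distances for surviving candidates only.
    refined = [(exact(query, o), o) for o in candidates]
    return heapq.nsmallest(k, refined, key=lambda pair: pair[0]), len(refined)
```

With a lower bound alone, no threshold is known before the first exact distances have been computed; the k-th smallest upper bound supplies one directly from the filter step, which is the intuition behind using both bounds to shrink the candidate set.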
Similar resources
Non-zero probability of nearest neighbor searching
Nearest Neighbor (NN) searching is a challenging problem in data management and has been widely studied in data mining, pattern recognition, and computational geometry. The goal of NN searching is to efficiently report the data nearest to a given query object. In most studies, both the data and the query are assumed to be precise; however, due to the real applications of NN searching, suc...
Ranked Continuous Visible Nearest Neighbor Search
Physical obstacles (e.g., buildings, hills, and blindages) are ubiquitous in the real world, and their existence may affect the visibility between objects and thus the results of spatial queries such as range queries, nearest neighbor searches, and spatial joins. In this paper, we study a novel type of spatial query, namely, ranked continuous visible nearest neighbor (RCVNN) search, whic...
Identification of selected monogeneans using image processing, artificial neural network and K-nearest neighbor
Over the last two decades, improvements in computational tools have contributed significantly to the classification of images of biological specimens into their corresponding species. These days, identification of biological species is much easier for taxonomists and even non-taxonomists due to the development of automated computer techniques and systems. In this study, we d...
DART: An Efficient Method for Direction-Aware Bichromatic Reverse k Nearest Neighbor Queries
This paper presents a novel type of query in spatial databases, called the direction-aware bichromatic reverse k nearest neighbor (DBRkNN) query, which extends the bichromatic reverse nearest neighbor query. Given two disjoint sets, P and S, of spatial objects, and a query object q in S, the DBRkNN query returns a subset P′ of P such that k nearest neighbors of each object in P′ include ...
Processing All k-Nearest Neighbor Queries in Hadoop
A k-nearest neighbor (kNN) query, which retrieves the k nearest points from a database, is one of the fundamental query types in spatial databases. An all k-nearest neighbor query (AkNN query), a variation of the kNN query, determines the k nearest neighbors of each point in the dataset in a single query process. In this paper, we propose a method for processing AkNN queries in Hadoop. We decompose the give...